Specify the task CPU and memory (IT-4056) #83

tschaffter · 2024-12-13T05:04:43Z

Closes https://sagebionetworks.jira.com/browse/IT-4056

Changelog

Add a class for specifying valid task CPU and memory pairs.
Set container max memory to align with task max memory.

tschaffter · 2024-12-13T16:33:04Z

I realize that each task must share its resources with two more containers:

ecs-service-connect: no CPU and memory limit visible in AWS Console.
aws-guardduty-agent: hard max memory limit set to 0.125 GB.

One solution is to increase the memory made available to the task. The increment is usually 1 GB, which costs $3.20 / month (see Fargate pricing). So for 13 tasks, that's about $40 / month / environment, or a total of $1440 / year.
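As a rough check of the estimate above, the arithmetic can be sketched as follows. The price per GB and the task count come from the comment; the environment count of 3 is an assumption inferred from the yearly total, and the function name is hypothetical.

```python
def extra_memory_cost(tasks: int, usd_per_gb_month: float, extra_gb: int = 1,
                      environments: int = 1, months: int = 1) -> float:
    """Cost in USD of adding `extra_gb` of memory to every task."""
    return tasks * extra_gb * usd_per_gb_month * environments * months


PRICE_USD_PER_GB_MONTH = 3.20  # figure quoted in the comment above
TASKS = 13                     # task count quoted in the comment above
ENVIRONMENTS = 3               # assumption: not stated in the thread

monthly_per_env = extra_memory_cost(TASKS, PRICE_USD_PER_GB_MONTH)
yearly_total = extra_memory_cost(
    TASKS, PRICE_USD_PER_GB_MONTH, environments=ENVIRONMENTS, months=12
)

print(f"{monthly_per_env:.2f} USD / month / environment")
print(f"{yearly_total:.2f} USD / year")
```

Without rounding down to $40, this comes to about $41.60 per month per environment, and roughly $1,500 per year under the three-environment assumption.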

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

memory_limit_mib (Union[int, float, None]) – The amount (in MiB) of memory to present to the container. If your container attempts to exceed the allocated memory, the container is terminated. At least one of memoryLimitMiB and memoryReservationMiB is required for non-Fargate services. Default: - No memory limit.

BryanFauble · 2024-12-17T19:19:18Z

ecs-service-connect: no CPU and memory limit visible in AWS Console.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect-concepts-deploy.html#service-connect-concepts-proxy

These are the numbers they recommend:

For task definitions, you must set the CPU and memory parameters.

We recommend adding an additional 256 CPU units and at least 64 MiB of memory to your task CPU and memory for the Service Connect proxy container. On AWS Fargate, the lowest amount of memory that you can set is 512 MiB of memory. On Amazon EC2, task definition memory is required.

For the service, you set the log configuration in the Service Connect configuration.

If you expect tasks in this service to receive more than 500 requests per second at their peak load, we recommend adding 512 CPU units to your task CPU in this task definition for the Service Connect proxy container.
If you expect to create more than 100 Service Connect services in the namespace or 2000 tasks in total across all Amazon ECS services within the namespace, we recommend adding 128 MiB of memory to your task memory for the Service Connect proxy container.
You should do this in every task definition that is used by all of the Amazon ECS services in the namespace.
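The guidance quoted above could be encoded roughly as below. This is only a sketch: the function name and parameters are hypothetical, and it assumes the higher figures (512 CPU units, 128 MiB) replace the baseline figures rather than being added on top of them, which the quoted text does not state explicitly.

```python
def service_connect_overhead(peak_rps: int = 0, services: int = 0,
                             tasks: int = 0) -> tuple[int, int]:
    """Return (cpu_units, memory_mib) to add to a task for the proxy container."""
    # Baseline recommendation: 256 CPU units and 64 MiB of memory.
    # Assumption: the larger values replace the baseline, they do not stack.
    cpu_units = 512 if peak_rps > 500 else 256
    memory_mib = 128 if (services > 100 or tasks > 2000) else 64
    return cpu_units, memory_mib


print(service_connect_overhead())              # baseline
print(service_connect_overhead(peak_rps=600))  # busy service
```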

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

I like this idea as the first approach. We can then use CloudWatch metrics and adjust from there if we need to bump up to the next valid memory/CPU config.
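A minimal sketch of that first approach: keep the task memory fixed and give the application container whatever is left after the sidecars. The sidecar figures are taken from this thread (0.125 GB for the GuardDuty agent, read here as 128 MiB, and the 64 MiB minimum recommended for the Service Connect proxy); the function and constant names are hypothetical.

```python
GUARDDUTY_AGENT_MIB = 128   # 0.125 GB hard limit reported above (assumed to mean 128 MiB)
SERVICE_CONNECT_MIB = 64    # minimum recommended in the AWS guidance quoted above


def app_container_memory_mib(
    task_memory_mib: int,
    sidecars_mib: tuple[int, ...] = (GUARDDUTY_AGENT_MIB, SERVICE_CONNECT_MIB),
) -> int:
    """Memory left for the application container once sidecars are accounted for."""
    remaining = task_memory_mib - sum(sidecars_mib)
    if remaining <= 0:
        raise ValueError(
            f"task memory {task_memory_mib} MiB leaves no room for the application container"
        )
    return remaining


print(app_container_memory_mib(1024))
```

For a 1024 MiB task this leaves 832 MiB for the application container, a value that can then be tuned against observed usage.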

zaro0508 · 2024-12-23T17:55:13Z

openchallenges/fargate_cpu_memory.py

+from enum import Enum
+
+
+class FargateCpuMemory(Enum):


I'm not a fan of this because these combinations are already kept in AWS and they could change over time. Keeping a copy here would introduce the maintenance burden of keeping it up to date. Instead, how about we just query AWS for the correct combinations and, if the user doesn't provide a valid one, throw an exception and provide a link for the user to look up valid combos in AWS?
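The "throw an exception and provide a link" half of this suggestion might look like the sketch below. How the table of valid combinations would be obtained from AWS is left open, so it is passed in as an argument; the sample table is an illustrative subset that may be out of date, and the function, exception, and constant names are hypothetical.

```python
# Documentation page to point users at (Amazon ECS task definition parameters).
DOCS_URL = (
    "https://docs.aws.amazon.com/AmazonECS/latest/developerguide/"
    "task_definition_parameters.html"
)


class InvalidFargateSize(ValueError):
    """Raised when a CPU/memory pair is not in the supplied table."""


def check_fargate_size(cpu: int, memory_mib: int,
                       valid: dict[int, frozenset[int]]) -> None:
    """Raise if (cpu, memory_mib) is not listed in `valid`."""
    if memory_mib not in valid.get(cpu, frozenset()):
        raise InvalidFargateSize(
            f"cpu={cpu}, memory={memory_mib} MiB is not a valid Fargate "
            f"combination; see {DOCS_URL}"
        )


# Illustrative subset only (assumption): not an authoritative or complete list.
SAMPLE_COMBINATIONS = {
    256: frozenset({512, 1024, 2048}),
    512: frozenset({1024, 2048, 3072, 4096}),
}

check_fargate_size(256, 512, SAMPLE_COMBINATIONS)  # passes silently
```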

zaro0508 · 2024-12-23T18:05:35Z

One solution is to increase the memory made available to the task. The increment is usually 1 GB, which costs $3.20 / month (see Fargate pricing). So for 13 tasks, that's about $40 / month / environment, or a total of $1440 / year.

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

I'm not a fan of either of these solutions because both would require managing both task and container memory. I suggest we set only the task CPU and memory and not set the container memory at all. This would allow all of the containers in an ECS task to share the CPU and memory defined at the task level. This seems like the easiest solution. This article on how ECS memory and CPU settings work helped me understand them, particularly the section on "Scenarios for different memory configurations".
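To illustrate the shape of that suggestion, here is a sketch that builds a task definition as a plain dictionary using the field names of the ECS task definition format. Only the task-level `cpu` and `memory` are set, and the container carries no `memory` key unless one is passed explicitly. The helper name, family, and image are hypothetical, and this is not the CDK code used in this repository.

```python
from typing import Optional


def task_definition(family: str, image: str, cpu: int, memory_mib: int,
                    container_memory_mib: Optional[int] = None) -> dict:
    """Build a Fargate task definition with sizes set at the task level."""
    container = {"name": family, "image": image, "essential": True}
    if container_memory_mib is not None:
        # Optional hard limit for this one container; omitted by default so
        # that all containers share the task-level memory.
        container["memory"] = container_memory_mib
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": str(cpu),            # task-level CPU units
        "memory": str(memory_mib),  # task-level memory in MiB
        "containerDefinitions": [container],
    }


td = task_definition("example-service", "example/image:latest", 512, 1024)
print(td["containerDefinitions"][0])
```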

tschaffter added 2 commits December 13, 2024 04:50

Parametrize task cpu and memory

901aad6

Specify the container memory

c035828

tschaffter self-assigned this Dec 13, 2024

tschaffter marked this pull request as ready for review December 13, 2024 05:06

tschaffter requested review from a team as code owners December 13, 2024 05:06

zaro0508 requested changes Dec 23, 2024
